Multimodal Language Processing For

Authors

  • Michael Johnston
  • Srinivas Bangalore
  • Amanda Stent
  • Gunaranjan Vasireddy
  • Patrick Ehlen
Abstract

Interfaces for mobile information access need to allow users flexibility in their choice of modes and interaction style in accordance with their preferences, the task at hand, and their physical and social environment. This paper describes the approach to multimodal language processing in MATCH (Multimodal Access To City Help), a mobile multimodal speech-pen interface to restaurant and subway information for New York City. Finite-state methods for multimodal integration and understanding enable users to interact using pen, speech, or dynamic combinations of the two, and a speech-act based multimodal dialogue manager enables mixed-initiative multimodal dialogue.

1. LANGUAGE PROCESSING FOR MOBILE SYSTEMS

Mobile information access devices (PDAs, tablet PCs, next-generation phones) offer limited screen real estate and no keyboard or mouse, making complex graphical interfaces cumbersome. Multimodal interfaces can address this problem by enabling speech and pen input, and output that combines synthetic speech and graphics (see [1] for a detailed overview of previous work on multimodal input and output). Furthermore, since mobile devices are used in situations involving different physical and social environments, tasks, and users, they need to allow users to provide input in whichever mode or combination of modes is most appropriate given the situation and the user's preferences.

Our testbed multimodal application MATCH (Multimodal Access To City Help) allows all commands to be expressed either by speech, by pen, or multimodally. This is achieved by capturing the parsing, integration, and understanding of speech and gesture inputs in a single multimodal grammar, which is compiled into a multimodal finite-state device. This device is tightly integrated with a speech-act based multimodal dialogue manager, enabling users to complete commands either in a single turn or over the course of a number of dialogue turns. In Section 2, we describe the MATCH application. In Section 3, we describe the multimodal language processing architecture underlying MATCH.

2. THE MATCH APPLICATION

Urban environments present a complex and constantly changing body of information regarding restaurants, cinema and theatre schedules, transportation topology, and timetables. This information is most valuable if it can be delivered effectively while mobile, since users' needs change while they are out and the information itself is dynamic (e.g., train times change and shows get cancelled).

Thanks to AT&T Labs and DARPA ITO (contract No. MDA972-99-30003) for financial support.
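The idea from Section 1 of capturing speech and gesture in a single grammar compiled into a finite-state device can be illustrated with a toy sketch. The code below is not the MATCH implementation. It assumes a small hand-written machine whose arcs each carry a speech symbol, a gesture symbol, and a meaning symbol, with epsilon marking a tape the arc does not touch. The states, the symbols (`hw_phone`, `sel_rest`, `cafe_x`), and the `integrate` function are all hypothetical names chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

EPS = None  # epsilon: the arc reads or writes nothing on that tape

# One arc = (speech symbol, gesture symbol, meaning symbol, next state).
Arc = Tuple[Optional[str], Optional[str], Optional[str], int]

# Hypothetical toy grammar for a "phone number of a restaurant" command.
ARCS: Dict[int, List[Arc]] = {
    0: [("phone", EPS, "phone", 1),          # spoken command word
        (EPS, "hw_phone", "phone", 2)],      # same command, handwritten
    1: [("for", EPS, EPS, 2)],
    2: [("this", EPS, EPS, 3),
        (EPS, "sel_rest", "selection", 4),   # pen-only selection
        ("cafe_x", EPS, "cafe_x", 4)],       # spoken (made-up) restaurant name
    3: [("restaurant", "sel_rest", "selection", 4)],  # speech and pen together
}
START, FINAL = 0, {4}


def integrate(speech: List[str], gesture: List[str],
              state: int = START, i: int = 0, j: int = 0) -> Optional[List[str]]:
    """Depth-first search for a path that consumes both input streams.

    Returns the meaning symbols along the first accepting path, or None
    if the grammar does not accept this speech/gesture combination.
    """
    if state in FINAL and i == len(speech) and j == len(gesture):
        return []
    for s_sym, g_sym, m_sym, nxt in ARCS.get(state, []):
        # An arc with a non-epsilon symbol must match the next token on that tape.
        if s_sym is not EPS and (i >= len(speech) or speech[i] != s_sym):
            continue
        if g_sym is not EPS and (j >= len(gesture) or gesture[j] != g_sym):
            continue
        ni = i + (1 if s_sym is not EPS else 0)
        nj = j + (1 if g_sym is not EPS else 0)
        rest = integrate(speech, gesture, nxt, ni, nj)
        if rest is not None:
            return ([m_sym] if m_sym is not EPS else []) + rest
    return None


if __name__ == "__main__":
    print(integrate(["phone", "for", "this", "restaurant"], ["sel_rest"]))
    print(integrate([], ["hw_phone", "sel_rest"]))
    print(integrate(["phone", "for", "cafe_x"], []))
```

In this sketch the multimodal input and the pen-only input both yield `['phone', 'selection']`, while the speech-only input yields `['phone', 'cafe_x']`. The same machine therefore accepts a command in any of the three styles, which is the property the text attributes to the single multimodal grammar. A real system would presumably operate on recognizer lattices rather than single token sequences.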


Similar resources

A Multimodal Approach toward Teaching for Transfer: A Case of Team-Teaching in ESAP Writing Courses

This paper presents a detailed examination of learning transfer from an English for Specific Academic Purposes course to authentic discipline-specific writing tasks. To enhance transfer practices, a new approach in planning writing tasks and materials selection was developed. Concerning the conventions of studies in learning transfer that acknowledge different learning preferences, the instruct...


Language Technology – a Survey of the State of the Art Language Resources – Multimodal Language Resources

This article provides an overview of research in multimodal language processing and associated resources. It defines multimodal processing, describes key challenges, identifies potential benefits, and outlines the major tasks, including multimodal input interpretation, multimodal output generation, and multimodal information access. The article exemplifies the state of the art in multimedia and...


A Critical Visual Analysis of Gender Representation of ELT Materials from a Multimodal Perspective

This content analysis study, employing a multimodal perspective and critical visual analysis, set out to analyze gender representations in Top Notch series, one of the highly used ELT textbooks in Iran. For this purpose, six images were selected from these series and analyzed in terms of ‘representational’, ‘interactive’ and ‘compositional’ modes of meanings. The result indicated that there are...


The multimodal nature of spoken word processing in the visual world: Testing the predictions of alternative models of multimodal integration

Ambiguity in natural language is ubiquitous (Piantadosi, Tily & Gibson, 2012), yet spoken communication is effective due to integration of information carried in the speech signal with information available in the surrounding multimodal landscape. However, current cognitive models of spoken word recognition and comprehension are underspecified with respect to when and how multimodal information...


Achieving Multimodal Cohesion during Intercultural Conversations

How do English as a lingua franca (ELF) speakers achieve multimodal cohesion on the basis of their specific interests and cultural backgrounds? From a dialogic and collaborative view of communication, this study focuses on how verbal and nonverbal modes cohere together during intercultural conversations. The data include approximately 160-minute transcribed video recordings of ELF interactions ...


Multimodal signal processing in naturalistic noisy environments

When a system must process spoken language in natural environments that involve different types and levels of noise, the problem of supporting robust recognition is a very difficult one. In the present studies, over 2,600 multimodal utterances were collected during both mobile and stationary use of a multimodal pen/voice system. The results confirmed that multimodal signal processing supports s...



Publication date: 2002